
libpod: improve heuristic to detect cgroup #12403

Merged

Conversation

giuseppe (Member)

Improve the heuristic to detect the scope that was created for the container.

This is necessary with systemd running as PID 1 in the container: systemd moves itself to a different sub-cgroup, so stats would otherwise not account for the other processes in the container.

Closes: #12400

Signed-off-by: Giuseppe Scrivano [email protected]
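
For context, a minimal, self-contained sketch of the idea behind the heuristic (illustrative code, not the actual libpod implementation): read the cgroup of the container's first process and truncate it at the `libpod-<id>.scope` component, so that sub-cgroups created by systemd inside the container are ignored when collecting stats.

```go
package main

import (
	"fmt"
	"strings"
)

// containerScopePath trims a cgroup path, as read from
// /proc/<pid>/cgroup, down to the scope created for the container.
// The "libpod-<id>.scope" naming reflects the systemd cgroup layout
// Podman uses; this helper is an illustrative assumption, not the
// exact libpod code.
func containerScopePath(cgroupPath, containerID string) string {
	scope := "libpod-" + containerID + ".scope"
	if idx := strings.Index(cgroupPath, scope); idx >= 0 {
		return cgroupPath[:idx+len(scope)]
	}
	return cgroupPath
}

func main() {
	// systemd running as PID 1 has moved itself into a sub-cgroup:
	path := "/machine.slice/libpod-58efadc49b77.scope/init.scope"
	fmt.Println(containerScopePath(path, "58efadc49b77"))
	// Output: /machine.slice/libpod-58efadc49b77.scope
}
```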

openshift-ci bot (Contributor) commented Nov 24, 2021:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the approved label Nov 24, 2021
@flouthoc (Collaborator) left a comment:

cgroupPath has an early check that fails if the container is not running, but inspect can be performed on a container which is not running. Here: https://github.com/containers/podman/blob/main/libpod/container.go#L921

Maybe we could remove that check and return an empty string if the container is not running.

PS: the following error line will break tests/usage for inspect called on non-running containers.
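
(For context, a minimal sketch of the behavior suggested above, with illustrative type and field names rather than libpod's actual API: return an empty string instead of an error when the container is not running, so inspect keeps working on stopped containers.)

```go
package main

import "fmt"

// State models just enough of a container lifecycle for this sketch;
// the names are illustrative, not libpod's actual API.
type State int

const (
	StateCreated State = iota
	StateRunning
	StateStopped
)

type Container struct {
	state      State
	cgroupPath string
}

// CgroupPath returns the container's cgroup path, or an empty string
// when the container is not running, instead of an error, so that
// inspect on a stopped container keeps working.
func (c *Container) CgroupPath() string {
	if c.state != StateRunning {
		return ""
	}
	return c.cgroupPath
}

func main() {
	stopped := &Container{state: StateStopped}
	fmt.Printf("cgroup of a stopped container: %q\n", stopped.CgroupPath())
}
```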

giuseppe force-pushed the improve-cgroup-detection branch from d8f180e to a6845b7 on November 24, 2021 11:22
giuseppe (Member, Author) replied:

> Maybe we could remove that check and return an empty string if the container is not running.

Yes, good idea; fixed and pushed a new version.

giuseppe force-pushed the improve-cgroup-detection branch 2 times, most recently from 8fd39be to cc6f6a3 on November 24, 2021 12:43
@vrothberg (Member) left a comment:

LGTM

OCI runtimes may set the memory limits in different ways, e.g., crun
creates a sub-cgroup where the limits are applied, while runc applies
them directly on the created cgroup.  Since there is no standardization
on the cgroup path to use, just use the limit specified in the spec
file.

Signed-off-by: Giuseppe Scrivano <[email protected]>
improve the heuristic to detect the scope that was created for the container.

This is necessary with systemd running as PID 1, since it moves itself
to a different sub-cgroup; stats would otherwise not account for other
processes in the same container.

Closes: containers#12400

Signed-off-by: Giuseppe Scrivano <[email protected]>
giuseppe force-pushed the improve-cgroup-detection branch from cc6f6a3 to e648122 on November 24, 2021 13:50
```diff
-func getMemLimit(cgroupLimit uint64) uint64 {
+// getMemLimit returns the memory limit for a container
+func (c *Container) getMemLimit() uint64 {
+	memLimit := uint64(math.MaxUint64)
```
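
(For context, a self-contained sketch of the spec-based approach described in the commit message above, assuming the OCI runtime-spec Go types; the exact libpod code may differ. The idea: take the limit from the container's spec rather than from the cgroup, and cap it at the host's physical memory.)

```go
package main

import (
	"fmt"
	"math"

	spec "github.com/opencontainers/runtime-spec/specs-go"
	"golang.org/x/sys/unix"
)

// getMemLimit reads the memory limit from the container's OCI spec
// instead of the cgroup, since runtimes place limits on different
// cgroup paths.  When no limit is set, it falls back to the host's
// physical memory, matching the `podman stats` output shown below.
func getMemLimit(s *spec.Spec) uint64 {
	memLimit := uint64(math.MaxUint64)
	if s.Linux != nil && s.Linux.Resources != nil &&
		s.Linux.Resources.Memory != nil && s.Linux.Resources.Memory.Limit != nil {
		memLimit = uint64(*s.Linux.Resources.Memory.Limit)
	}

	var si unix.Sysinfo_t
	if err := unix.Sysinfo(&si); err != nil {
		return memLimit
	}
	physical := uint64(si.Totalram) * uint64(si.Unit)

	// No limit in the spec, or a limit above physical memory:
	// report the host's memory instead.
	if memLimit == 0 || memLimit > physical {
		return physical
	}
	return memLimit
}

func main() {
	limit := int64(512 * 1024 * 1024)
	s := &spec.Spec{Linux: &spec.Linux{Resources: &spec.LinuxResources{
		Memory: &spec.LinuxMemory{Limit: &limit},
	}}}
	fmt.Println(getMemLimit(s)) // 536870912
}
```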
A Member left a comment:

99% sure this is a break from Docker's behavior - we should report a limit of 0 for containers that do not have a resource limit, not the maximum amount of RAM on the current system.

giuseppe (Member, Author) replied:

I changed it because a test was failing.

The current Podman behavior, AFAICS, is to report the maximum amount of RAM:

$ /usr/bin/podman run --rm -ti -d alpine top
58efadc49b77ef300eccb02715235616fb5729135198665f5ca7745006f93940
$ /usr/bin/podman stats -l --no-reset --no-stream
ID            NAME          CPU %       MEM USAGE / LIMIT  MEM %       NET IO      BLOCK IO    PIDS        CPU TIME    AVG CPU %
58efadc49b77  crazy_fermat  4.40%       233.5kB / 33.39GB  0.00%       -- / --     -- / --     1           7.2ms       4.40%

The same Member replied:

Can you verify that podman inspect still reports 0 if not limited? If so, I'm fine with this.

giuseppe (Member, Author) replied:

They return the same values:

$ diff <(bin/podman inspect -l | grep -i memory) <(/usr/bin/podman inspect -l | grep -i memory) | wc -l
0

rhatdan (Member) commented Nov 25, 2021:

/lgtm

openshift-ci bot added the lgtm label Nov 25, 2021
openshift-merge-robot merged commit 12f73d5 into containers:main on Nov 25, 2021
github-actions bot added the "locked - please file new issue/PR" label, locked the conversation as resolved, and limited it to collaborators on Sep 22, 2023
Labels

approved · lgtm · locked - please file new issue/PR
Development

Successfully merging this pull request may close these issues:

podman stats reports only the container's PID 1 information, which is not so useful for systemd containers
6 participants